Abstract. Some of the important inequalities associated with quantum
entropy are immediate algebraic consequences of the
Hansen-Pedersen-Jensen inequalities. A general argument is given using
matrix perspectives of operator convex functions. A matrix analogue of
Maréchal's extended perspectives provides additional inequalities,
including a $p + q \le 1$ result of Lieb.



1. Introduction 

In 1973, Elliott Lieb published a ground-breaking paper on operator
inequalities [10]. This and a subsequent paper by Lieb and Ruskai [11] have
had a profound effect on quantum statistical mechanics, and more recently
on quantum information theory. Since then, a number of attempts have
been made to elucidate and extend these results. Two particularly elegant
examples are those of Nielsen and Petz [15], and Ruskai [17], which use
the analytic representations for operator convex functions. On the other
hand, Frank Hansen [7] has developed a powerful theory that utilizes
geometric means of positive operators. The latter notion was formulated by
Pusz and Woronowicz [16], and subsequently investigated by Ando [1] (see
the discussion in Section 3) and by Kubo and Ando [9].

Here we present what is arguably the simplest approach to these
inequalities. This is accomplished by using matrix analogues of two
elementary ideas from classical convexity theory: the Jensen inequality,
and the construction of the perspective of a convex function. For the
first, we employ the matricial Jensen inequality of Frank Hansen and Gert
Pedersen [5], [6]. As we point out in Section 5, the affine and homogeneous
versions of this inequality can be proved in relatively few lines drawn
from those papers. The non-commutative analogues of perspectives are
completely straightforward in the context of the left and right module
operations that are standard to the subject. In Section 3 we show that the
same approach may be used to quantize Maréchal's extended version of the
perspective. We apply this to prove Lieb's generalized $p + q \le 1$
inequality (see also the elegant proof in [?]).

The appearance of notions from convexity theory suggests that other
geometric techniques will prove useful in the operator context. In a
different direction, quantum information theory is likely to have an impact
on the theory of matrix convexity. This possibility is considered in
Section 4.

I am grateful to Frank Hansen for alerting me to his work in this area.
I also wish to thank Mary Beth Ruskai, who corrected a number of errors
in my first manuscript, Jon Tyson for a host of suggestions, and Richard
Kadison for his encouragement.

Since the basic difficulties are already apparent in finite dimensions, we 
have restricted our attention to finite matrices, and we have avoided any 
attempt at full generality even in that context. 

2. The classical and matrix notions of perspectives 

Given a convex function $f$ defined on a convex set
$K \subseteq \mathbb{R}^n$, the perspective $g$ is defined on the subset

\[ L = \{(x, t) : t > 0 \text{ and } x/t \in K\} \]

by

\[ g(x, t) = f(x/t)\, t \]

(see [8]). It is a simple exercise to verify that $g(x, t)$ is a jointly
convex function in the sense that if $0 \le c \le 1$, then

\[ g(c x_1 + (1 - c) x_2,\; c t_1 + (1 - c) t_2) \le c\, g(x_1, t_1) + (1 - c)\, g(x_2, t_2). \]

An elementary but important example is provided by the continuous convex
function $f(x) = x \log x$, with $f(0) = 0$, defined on
$[0, \infty) \subseteq \mathbb{R}$. It follows that the perspective function

\[ g(x, t) = t\, \frac{x}{t} \log \frac{x}{t} = x \log x - x \log t \]

is jointly convex. Letting $p = (p_i)$ and $q = (q_i)$ be finite probability
measures with $p_i > 0$ and $q_i > 0$, the convexity of $f$ implies that the
classical entropy

\[ H(p) = -\sum_i p_i \log p_i \]

is concave, and the convexity of $g$ implies that the relative entropy

\[ (q, p) \mapsto H(q\|p) = \sum_i \bigl( p_i \log p_i - p_i \log q_i \bigr) \]

is jointly convex on pairs of probability measures.
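
The joint convexity of the relative entropy can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the helper names (`perspective`, `relative_entropy`, `convexity_gap`) are illustrative assumptions, and a random test is of course no substitute for the proof. It samples strictly positive probability vectors and records the difference between the two sides of the convexity inequality.

```python
import math
import random

def perspective(x, t):
    """g(x, t) = t * f(x / t) for f(x) = x log x, i.e. x log x - x log t."""
    return 0.0 if x == 0 else x * math.log(x) - x * math.log(t)

def relative_entropy(p, q):
    """Sum over i of g(p_i, q_i) = p_i log p_i - p_i log q_i."""
    return sum(perspective(pi, qi) for pi, qi in zip(p, q))

def random_distribution(n, rng):
    """Random strictly positive probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def mix(c, u, v):
    """Convex combination c*u + (1 - c)*v of two vectors."""
    return [c * a + (1 - c) * b for a, b in zip(u, v)]

def convexity_gap(c, p1, q1, p2, q2):
    """Right side minus left side of the joint convexity inequality."""
    lhs = relative_entropy(mix(c, p1, p2), mix(c, q1, q2))
    rhs = c * relative_entropy(p1, q1) + (1 - c) * relative_entropy(p2, q2)
    return rhs - lhs

rng = random.Random(0)
gaps = []
for _ in range(1000):
    p1, q1, p2, q2 = (random_distribution(4, rng) for _ in range(4))
    gaps.append(convexity_gap(rng.random(), p1, q1, p2, q2))

# Joint convexity predicts a nonnegative gap, up to rounding error.
print(min(gaps) >= -1e-12)
```

Because each summand is a value of the perspective $g$, the check exercises exactly the scalar inequality stated above, term by term.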

We recall that if $f : I = [a, b] \to \mathbb{R}$ is continuous, and $T$ is
an $n \times n$ self-adjoint matrix with spectrum in $[a, b]$, then we can
define $f_n(T)$ by spectral theory (or by using a basis in which $T$ is
diagonal). The function $f$ is said to be matrix convex if for each
$n \in \mathbb{N}$, the corresponding function $f_n$ is convex on the
self-adjoint $n \times n$ matrices with spectrum in $[a, b]$. Throughout the
rest of the paper we only consider $n \times n$ matrices, and we usually
omit the subscript $n$.



MATRIX CONVEXITY 






The following is the affine version of the Hansen-Pedersen-Jensen
inequality [6] (see Section 5).

Theorem 2.1. If $f$ is matrix convex, and $A$ and $B$ satisfy
$A^*A + B^*B = I_n$, then

\[ f(A^* T_1 A + B^* T_2 B) \le A^* f(T_1) A + B^* f(T_2) B. \tag{2.1} \]
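
As an illustration of (2.1), here is a small numerical sketch for the operator convex function $f(t) = t^2$ on real symmetric $2 \times 2$ matrices. Everything in it is an assumption made for the example: the helper names, the choice of $f$, and the construction $A = UCW$, $B = VSW$ with rotations $U, V, W$ and diagonal $C, S$ satisfying $C^2 + S^2 = I$, which is just one convenient way to obtain $A^*A + B^*B = I$.

```python
import math
import random

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(x):
    return [[x[0][0], x[1][0]], [x[0][1], x[1][1]]]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

def random_symmetric(rng):
    a, b, d = (rng.uniform(-1, 1) for _ in range(3))
    return [[a, b], [b, d]]

def square(t):
    """f(T) = T^2 for the operator convex function f(t) = t^2."""
    return matmul(t, t)

def conj(m, t):
    """The compression M* T M (here M* is the transpose, since M is real)."""
    return matmul(matmul(transpose(m), t), m)

def jensen_gap(rng):
    """Return A* f(T1) A + B* f(T2) B - f(A* T1 A + B* T2 B) for random data."""
    # A = U C W and B = V S W with C^2 + S^2 = I, so A*A + B*B = W* W = I.
    a1, a2 = rng.uniform(0, math.pi / 2), rng.uniform(0, math.pi / 2)
    c = [[math.cos(a1), 0.0], [0.0, math.cos(a2)]]
    s = [[math.sin(a1), 0.0], [0.0, math.sin(a2)]]
    u, v, w = (rotation(rng.uniform(0, 2 * math.pi)) for _ in range(3))
    a = matmul(matmul(u, c), w)
    b = matmul(matmul(v, s), w)
    t1, t2 = random_symmetric(rng), random_symmetric(rng)
    lhs = square(add(conj(a, t1), conj(b, t2)))
    rhs = add(conj(a, square(t1)), conj(b, square(t2)))
    return [[rhs[i][j] - lhs[i][j] for j in range(2)] for i in range(2)]

def is_psd(m, tol=1e-10):
    """A symmetric 2x2 matrix is positive semidefinite iff trace and determinant are >= 0."""
    return m[0][0] + m[1][1] >= -tol and m[0][0] * m[1][1] - m[0][1] * m[1][0] >= -tol

rng = random.Random(1)
all_psd = all(is_psd(jensen_gap(rng)) for _ in range(1000))
print(all_psd)
```

The inequality in (2.1) is an operator inequality, so the check tests positive semidefiniteness of the difference rather than comparing traces.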

We begin with some matrix conventions. Given matrices $L$ and $R$, we let
$[L, R] = LR - RL$. Let us suppose that $L > 0$ and $R > 0$. If
$[L, R] = 0$, i.e., the matrices commute, then we may find a basis in which
both matrices are diagonalized. It follows that $LR > 0$,
$[L, R^{-1}] = 0$, and we may unambiguously write $\frac{L}{R}$ for the
quotient. We also recall that for any continuous function $f$, $f(L)$
commutes with any operator commuting with $L$ (including $L$ itself). Using
simultaneously diagonalized matrices, it is evident that we have relations
such as $\log L R^{-1} = \log L - \log R$.

Theorem 2.2. Suppose that $f$ is operator convex. When restricted to
positive commuting matrices $L$, $R$, the "perspective function"

\[ (L, R) \mapsto g(L, R) = f\!\left(\frac{L}{R}\right) R \tag{2.2} \]

is jointly convex in the sense that if $L = cL_1 + (1 - c)L_2$ and
$R = cR_1 + (1 - c)R_2$, where $[L_j, R_j] = 0$ ($j = 1, 2$) and
$0 \le c \le 1$, then

\[ g(L, R) \le c\, g(L_1, R_1) + (1 - c)\, g(L_2, R_2). \tag{2.3} \]

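Proof. The matrices $A = (cR_1)^{1/2} R^{-1/2}$ and
$B = ((1 - c)R_2)^{1/2} R^{-1/2}$ satisfy $A^*A + B^*B = I$. From
Theorem 2.1,

\begin{align*}
g(L, R) &= f\!\left(\frac{L}{R}\right) R \\
&= R^{1/2} f\!\left(A^* \frac{L_1}{R_1} A + B^* \frac{L_2}{R_2} B\right) R^{1/2} \\
&\le R^{1/2} \left(A^* f\!\left(\frac{L_1}{R_1}\right) A + B^* f\!\left(\frac{L_2}{R_2}\right) B\right) R^{1/2} \\
&= (cR_1)^{1/2} f\!\left(\frac{L_1}{R_1}\right) (cR_1)^{1/2} + ((1 - c)R_2)^{1/2} f\!\left(\frac{L_2}{R_2}\right) ((1 - c)R_2)^{1/2} \\
&= c\, g(L_1, R_1) + (1 - c)\, g(L_2, R_2).
\end{align*}

□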

The following result is due to Lieb and Ruskai [11] (a related early
discussion may be found in Lindblad [12]).

Corollary 2.3. The relative entropy function

\[ (\rho, \sigma) \mapsto S(\rho\|\sigma) = \mathrm{Trace}\, \rho \log \rho - \rho \log \sigma \]

is jointly convex on the strictly positive $n \times n$ density matrices
$\rho$, $\sigma$.

Proof. We let $M_n$ have the usual Hilbert space structure determined by
$\langle X, Y \rangle = \mathrm{Trace}\, XY^*$. Given positive density
matrices $\sigma$ and $\rho$, we define operators $R$ and $L$ on $M_n$ by
$L(X) = \rho X$ and $R(X) = X\sigma$. Then $L$ and $R$ are commuting
positive operators on the Hilbert space $M_n$. On the other hand, the
function $f(x) = x \log x$ is operator convex (see [2], p. 123), and thus

\begin{align*}
\langle g(L, R)(I), I \rangle
&= \left\langle R\, \frac{L}{R} \log\!\left(\frac{L}{R}\right)(I),\, I \right\rangle \\
&= \langle L(\log L - \log R)(I), I \rangle \\
&= \mathrm{Trace}\, \rho \log \rho - \rho \log \sigma = S(\rho\|\sigma)
\end{align*}

is jointly convex. □
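
Corollary 2.3 can be spot-checked on small examples. The sketch below is an illustration under stated assumptions, not the author's method: it restricts to real symmetric $2 \times 2$ density matrices (a special case of the corollary), evaluates matrix functions through the closed-form spectral decomposition of a $2 \times 2$ symmetric matrix, and uses hypothetical helper names such as `sym_fun` and `rel_entropy`.

```python
import math
import random

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def sym_fun(m, fn):
    """Apply fn to a real symmetric 2x2 matrix [[a, b], [b, d]] via its eigenvalues."""
    (a, b), (_, d) = m
    mean, half = (a + d) / 2, math.hypot((a - d) / 2, b)
    if half == 0:  # scalar matrix
        return [[fn(a), 0.0], [0.0, fn(a)]]
    lo, hi = mean - half, mean + half
    # fn(M) = fn(lo) I + k (M - lo I), the interpolating polynomial on {lo, hi}.
    k = (fn(hi) - fn(lo)) / (hi - lo)
    return [[fn(lo) + k * (a - lo), k * b], [k * b, fn(lo) + k * (d - lo)]]

def rel_entropy(rho, sigma):
    """S(rho||sigma) = Trace(rho log rho - rho log sigma)."""
    log_rho, log_sigma = sym_fun(rho, math.log), sym_fun(sigma, math.log)
    return trace(matmul(rho, log_rho)) - trace(matmul(rho, log_sigma))

def random_density(rng):
    """Random strictly positive real 2x2 density matrix (trace one)."""
    x = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    g = matmul(x, [[x[0][0], x[1][0]], [x[0][1], x[1][1]]])  # X X^T
    g[0][0] += 0.05  # shift by a multiple of I to stay strictly positive
    g[1][1] += 0.05
    t = trace(g)
    return [[v / t for v in row] for row in g]

def mix(c, u, v):
    return [[c * u[i][j] + (1 - c) * v[i][j] for j in range(2)] for i in range(2)]

rng = random.Random(3)
min_gap = float("inf")
for _ in range(1000):
    r1, s1, r2, s2 = (random_density(rng) for _ in range(4))
    c = rng.random()
    lhs = rel_entropy(mix(c, r1, r2), mix(c, s1, s2))
    rhs = c * rel_entropy(r1, s1) + (1 - c) * rel_entropy(r2, s2)
    min_gap = min(min_gap, rhs - lhs)

# Joint convexity predicts a nonnegative gap, up to rounding error.
print(min_gap >= -1e-9)
```

Here the pairs $(\rho_j, \sigma_j)$ need not commute with each other; in the proof the commutation is supplied by the left and right multiplication operators $L$ and $R$ acting on $M_n$.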

The following is due to Lieb [10]. It was subsequently used by Lieb and
Ruskai to prove strong subadditivity for relative entropy [11]. A stronger
result of Lieb is discussed in the next section.

Corollary 2.4. If $0 < s < 1$, then the function

\[ F(A, B) = \mathrm{Trace}\, A^s K^* B^{1-s} K \]

is jointly concave on the strictly positive $n \times n$ matrices $A$, $B$.

Proof. Since $f(t) = -t^s$ is operator convex (see [2], Th. 5.1.9),
$g(L, R) = -L^s R^{1-s}$ is jointly convex for appropriately commuting
operators. Again using the Hilbert space structure on $M_n$, we let
$L(X) = AX$ and $R(X) = XB$. It follows that

\[ (A, B) \mapsto -\mathrm{Trace}\, A^s K^* B^{1-s} K = \langle g(L, R)(K^*), K^* \rangle \]

is jointly convex. □

Various generalized entropies may be handled in much the same manner. 

3. Maréchal's perspectives

P. Maréchal has recently introduced an interesting generalization of
perspectivity for convex functions [13], [14]. This also has a natural
matrix version. For this purpose we use the subhomogeneous form of the
Hansen-Pedersen-Jensen inequality [5] (see Section 5). We assume that the
functions $f$ and $g$ are defined on an interval
$I \subseteq \mathbb{R}$, and that $0 \in I$.

Theorem 3.1. If $f$ is matrix convex and $f(0) \le 0$, and $A$ and $B$ are
matrices with $A^*A + B^*B \le I_n$, then

\[ f(A^* T_1 A + B^* T_2 B) \le A^* f(T_1) A + B^* f(T_2) B. \]

Given continuous functions $f$ and $h$, and commuting positive matrices $L$
and $R$, we define

\[ (f \Delta h)(L, R) = f\!\left(\frac{L}{h(R)}\right) h(R). \]

A close variation of the following result was proved for operator monotone
functions $f$ on $(0, \infty)$ by Ando (see [1], Theorem 6). His
construction (without the extra function $\Phi_2$, which can be
incorporated with a composition) is related to Maréchal's operation
$f \nabla h$ for concave functions $f$ and $h$. Ando invoked the integral
representation for operator monotone functions, rather than the matrix
convexity argument used below.

Theorem 3.2. Suppose that $f$ is matrix convex with $f(0) \le 0$, and that
$h$ is matrix concave with $h > 0$. Then
$(L, R) \mapsto (f \Delta h)(L, R)$ is jointly convex on positive commuting
matrices $L$, $R$ in the sense of Theorem 2.2.

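Proof. Let us suppose that $L = cL_1 + (1 - c)L_2$ and
$R = cR_1 + (1 - c)R_2$, where $[L_j, R_j] = 0$. Then
$c\, h(R_1) + (1 - c)\, h(R_2) \le h(R)$, hence

\[ A = c^{1/2} h(R_1)^{1/2} h(R)^{-1/2}, \qquad B = (1 - c)^{1/2} h(R_2)^{1/2} h(R)^{-1/2} \]

satisfy

\begin{align*}
A^*A + B^*B &= h(R)^{-1/2}\, c\, h(R_1)\, h(R)^{-1/2} + h(R)^{-1/2} (1 - c)\, h(R_2)\, h(R)^{-1/2} \\
&\le h(R)^{-1/2} h(R)\, h(R)^{-1/2} = I.
\end{align*}

It follows from Theorem 3.1 that

\begin{align*}
(f \Delta h)(L, R)
&= h(R)^{1/2} f\!\left(h(R)^{-1/2} L\, h(R)^{-1/2}\right) h(R)^{1/2} \\
&= h(R)^{1/2} f\!\left(A^* \frac{L_1}{h(R_1)} A + B^* \frac{L_2}{h(R_2)} B\right) h(R)^{1/2} \\
&\le h(R)^{1/2} A^* f\!\left(\frac{L_1}{h(R_1)}\right) A\, h(R)^{1/2} + h(R)^{1/2} B^* f\!\left(\frac{L_2}{h(R_2)}\right) B\, h(R)^{1/2} \\
&= c\, h(R_1)^{1/2} f\!\left(\frac{L_1}{h(R_1)}\right) h(R_1)^{1/2} + (1 - c)\, h(R_2)^{1/2} f\!\left(\frac{L_2}{h(R_2)}\right) h(R_2)^{1/2} \\
&= c\, (f \Delta h)(L_1, R_1) + (1 - c)\, (f \Delta h)(L_2, R_2).
\end{align*}

□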

To illustrate this result, we reprove Lieb's extension of Corollary 2.4
[10].

Corollary 3.3. Suppose that $0 < p, q$ and that $p + q \le 1$. Then the
function

\[ (A, B) \mapsto \mathrm{Trace}\, A^q X^* B^p X \]

is jointly concave on the positive $n \times n$ matrices.



EDWARD G. EFFROS 



Proof. Since $p + q \le 1$, $p + q$ is a convex combination of $q$ and 1,
i.e., we may choose $0 \le t \le 1$ with $p + q = (1 - t)q + t \cdot 1$. If
we let $q = s$, then

\[ p = -tq + t = (1 - q)t = (1 - s)t. \]

Thus it suffices to show that if $0 \le s, t \le 1$, then

\[ (A, B) \mapsto -\mathrm{Trace}\, A^s X^* B^{(1-s)t} X \]

is jointly convex. The functions $f(x) = -x^s$ and $h(y) = y^t$ are operator
convex and concave, respectively, and

\[ (f \Delta h)(L, R) = h(R)\, f\!\left(\frac{L}{h(R)}\right) = -R^t \left(\frac{L}{R^t}\right)^{s} = -L^s R^{(1-s)t}. \]

If we let $L(X) = AX$ and $R(X) = XB$ for $X \in M_n$, then from
Theorem 3.2,

\[ (A, B) \mapsto -\mathrm{Trace}\, A^s X^* B^{(1-s)t} X = \langle (f \Delta h)(L, R)(X^*), X^* \rangle \]

is jointly convex. □
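
The exponent bookkeeping in this proof, and the concavity it yields, can be illustrated numerically. The sketch below is only an illustration under assumptions: the sample exponents $q = 1/2$, $p = 1/4$, the restriction to real symmetric positive definite $2 \times 2$ matrices with real $X$, and the helper names (`sym_power`, `lieb`) are all choices made for this example.

```python
import math
import random

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(x):
    return [[x[0][0], x[1][0]], [x[0][1], x[1][1]]]

def sym_power(m, r):
    """M^r for a real symmetric positive definite 2x2 matrix, via its eigenvalues."""
    (a, b), (_, d) = m
    mean, half = (a + d) / 2, math.hypot((a - d) / 2, b)
    if half == 0:  # scalar matrix
        return [[a ** r, 0.0], [0.0, a ** r]]
    lo, hi = mean - half, mean + half
    k = (hi ** r - lo ** r) / (hi - lo)
    return [[lo ** r + k * (a - lo), k * b], [k * b, lo ** r + k * (d - lo)]]

def lieb(a, b, x, q, p):
    """F(A, B) = Trace A^q X* B^p X."""
    m = matmul(matmul(sym_power(a, q), transpose(x)), matmul(sym_power(b, p), x))
    return m[0][0] + m[1][1]

def random_positive(rng):
    """Random real symmetric positive definite 2x2 matrix."""
    g = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    m = matmul(g, transpose(g))
    m[0][0] += 0.05
    m[1][1] += 0.05
    return m

def mix(c, u, v):
    return [[c * u[i][j] + (1 - c) * v[i][j] for j in range(2)] for i in range(2)]

q, p = 0.5, 0.25           # sample exponents with p + q <= 1
s, t = q, p / (1 - q)      # the substitution from the proof: p = (1 - s) t

rng = random.Random(2)
worst = float("inf")
for _ in range(1000):
    a1, a2, b1, b2 = (random_positive(rng) for _ in range(4))
    x = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    c = rng.random()
    mixed = lieb(mix(c, a1, a2), mix(c, b1, b2), x, q, p)
    average = c * lieb(a1, b1, x, q, p) + (1 - c) * lieb(a2, b2, x, q, p)
    worst = min(worst, mixed - average)

# Joint concavity predicts mixed >= average, up to rounding error.
print(worst >= -1e-9)
```

With these sample values the substitution gives $s = 1/2$ and $t = 1/2$, so that $(1 - s)t = 1/4 = p$, matching the identity $p = (1 - s)t$ used in the proof.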

4. MATRIX CONVEXITY 

Perhaps the most intriguing aspect of Maréchal's construction is that it
behaves well under the Fenchel-Legendre transform, and under iteration.
Søren Winkler formulated an analogue of the Fenchel-Legendre duality for
matrix convex functions [18], but the transforms are generally set-valued
mappings. Further progress might result if one could reformulate his
theory in terms of "left-right" commuting pairs. It should also be noted
that other constructions in classical convexity theory, such as the linear
fractional transformations of convex functions (see [3]), might also have
matrix generalizations.

Until recently the theory of matrix convexity has suffered from a lack of 
examples and applications. With the advent of quantum information theory 
(QIT), this situation has dramatically changed. QIT provides a wealth of 
remarkable, purely non-classical techniques that might clarify some of the 
conceptual problems in matrix convexity theory. On the other hand, it seems 
likely that matrix convexity and more generally non-commutative functional 
analysis will provide an appropriate framework for many of the calculations 
in QIT. A striking illustration of this phenomenon can be found in [4]. 

5. A brief guide to the Hansen-Pedersen-Jensen inequalities

The original proof of Theorem 3.1 may be found in [5] (Theorem 2.1).
It is both elegant and concise. For our purposes we only need (i) implies
(iii) in their proof. On the other hand, Winkler pointed out in [18] that
Theorem 2.1 is easily derived from Theorem 3.1. Since our situation is
slightly different, we include the argument.

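We fix a point $c \in I$ and define $F(t) = f(t + c) - f(c)$. Given
$T = T^* \in M_n$, we may choose a basis with respect to which
$T = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Then

\[ F(T) = F(\mathrm{diag}(\lambda_1, \dots, \lambda_n)) = \mathrm{diag}\bigl(f(\lambda_1 + c) - f(c), \dots, f(\lambda_n + c) - f(c)\bigr) = f(T + cI) - f(c)I. \]

The function $F$ is matrix convex with $F(0) = 0$, so Theorem 3.1 applies
to it. If $A^*A + B^*B = I$, then applying Theorem 3.1 to $F$ with
$T_j - cI$ in place of $T_j$ gives

\[ f(A^* T_1 A + B^* T_2 B) - f(c)I \le A^* f(T_1) A - f(c) A^*A + B^* f(T_2) B - f(c) B^*B, \]

and thus

\[ f(A^* T_1 A + B^* T_2 B) \le A^* f(T_1) A + B^* f(T_2) B. \]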



As pointed out by Winkler [18], the result may be extended to rectangular
matrices $A$ and $B$. He used the case $B = 0$ to show that a real function
$f$ on an interval in $\mathbb{R}$ is a matrix convex function if and only
if the supergraphs of the $f_n$ form a matrix convex system of sets.